Leveraging recent developments in convolutional neural networks (CNNs), matching dense correspondence from a stereo pair has been cast as a learning problem, with performance exceeding traditional approaches. However, it remains challenging to generate high-quality disparities for the inherently ill-posed regions. To tackle this problem, we propose a novel cascade CNN architecture composed of two stages. The first stage advances the recently proposed DispNet by equipping it with extra up-convolution modules, leading to disparity images with more details. The second stage explicitly rectifies the disparity initialized by the first stage; it couples with the first stage and generates residual signals across multiple scales. The summation of the outputs from the two stages gives the final disparity. As opposed to directly learning the disparity at the second stage, we show that residual learning provides more effective refinement. Moreover, it also benefits the training of the overall cascade network. Experiments show that our cascade residual learning scheme provides state-of-the-art performance for matching stereo correspondence. At the time of submission of this paper, our method ranks first in the KITTI 2015 stereo benchmark, surpassing prior works by a noteworthy margin.
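The summation step described above can be illustrated with a minimal sketch. It is not the network itself: the names `downsample` and `cascade_sum` are hypothetical, the maps are plain Python lists, and we assume for illustration that the first-stage disparity is brought to each coarser scale by nearest-neighbour subsampling without rescaling its values.

```python
from typing import List

Disparity = List[List[float]]  # row-major 2D disparity map


def downsample(disp: Disparity, factor: int) -> Disparity:
    """Nearest-neighbour downsampling: keep every `factor`-th row and column.

    Assumption: disparity values are not rescaled when the map shrinks.
    """
    return [row[::factor] for row in disp[::factor]]


def cascade_sum(initial: Disparity, residuals: List[Disparity]) -> List[Disparity]:
    """Add the second-stage residual at scale s to the first-stage disparity
    downsampled by 2**s. Index 0 is full resolution."""
    outputs = []
    for s, residual in enumerate(residuals):
        base = downsample(initial, 2 ** s)
        outputs.append([
            [d + r for d, r in zip(base_row, res_row)]
            for base_row, res_row in zip(base, residual)
        ])
    return outputs


# Toy 2x4 first-stage disparity and made-up residuals at two scales.
initial = [[4.0, 4.0, 8.0, 8.0],
           [4.0, 4.0, 8.0, 8.0]]
residuals = [
    [[0.5, -0.5, 0.0, 1.0],
     [0.0, 0.0, -1.0, 0.5]],  # scale 0 (full resolution)
    [[0.25, -0.25]],          # scale 1 (half resolution)
]

refined = cascade_sum(initial, residuals)
final_disparity = refined[0]  # full-resolution sum of the two stages
```

In this sketch the second stage only has to supply a small correction on top of an already reasonable estimate, which is the intuition behind preferring residual learning over predicting the disparity directly.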